Skip to content

io_uring: bound injected completions per dispatcher tick#44665

Open
aburan28 wants to merge 4 commits into
envoyproxy:mainfrom
aburan28:aburan28/io-uring-bound-injected-completions
Open

io_uring: bound injected completions per dispatcher tick#44665
aburan28 wants to merge 4 commits into
envoyproxy:mainfrom
aburan28:aburan28/io-uring-bound-injected-completions

Conversation

@aburan28
Copy link
Copy Markdown

@aburan28 aburan28 commented Apr 27, 2026

Commit Message:
io_uring: bound injected completions per dispatcher tick

IoUringImpl::forEveryCompletion's injected-completion drain loop is unbounded:
a steady stream of injected completions — or callbacks that themselves inject
more completions — can stall the dispatcher thread indefinitely and starve
other event sources sharing the same loop. The TODO at io_uring_impl.cc:78
called this out explicitly.

This PR caps the per-tick drain at a configurable bound (default 1024). When
the cap is reached with completions still queued, forEveryCompletion writes
to the registered eventfd to re-arm the file event so processing resumes on
the next dispatcher tick. The eventfd is non-blocking and is fully drained at
the top of forEveryCompletion, so the self-poke is safe and idempotent.

Additional Description:

  • Default value (1024) is generous enough that it should not throttle
    realistic traffic; in steady state the loop drains the queue and never
    re-arms.
  • The cap is a constructor argument with a default, so all existing callers
    (IoUringWorkerImpl) are unchanged.
  • No proto/config surface change in this PR; making the cap operator-tunable
    can be a follow-up if anyone needs it.

AI usage disclosure: Portions of the code and/or PR description were drafted
with the assistance of Claude (Anthropic). I reviewed and understand all
submitted code.

Risk Level: Low
(Caps an existing unbounded loop with a generous default. Existing callers
are unchanged. In steady state this is a no-op.)

Testing:

  • Added IoUringImplTest.BoundedInjectedCompletionsPerEvent: injects 7
    completions with cap=3 and asserts the drain splits 3/3/1 across three
    dispatcher ticks, and that all 7 fire without any explicit activate()
    beyond the first — verifying the eventfd self-poke path.
  • Existing io_uring unit and integration tests on Linux CI.

Docs Changes: N/A

Release Notes: Added a bug_fixes entry in changelogs/current.yaml under
io_uring.

Platform Specific Features: N/A. io_uring is Linux-only and the affected
code already lives behind the existing io_uring build gating; this PR does
not change platform support.

Runtime guard: N/A. The new bound takes effect unconditionally with a
generous default; happy to add a guard if maintainers prefer.

The injected-completion drain loop in IoUringImpl::forEveryCompletion was
unbounded. A steady stream of injected completions, or completion callbacks
that inject more completions, could stall the dispatcher thread indefinitely
and starve other event sources sharing the same loop. The TODO at
io_uring_impl.cc:78 called this out explicitly.

Cap the per-tick drain at a configurable bound (default 1024). When the cap
is hit with completions still queued, write to the registered eventfd to
re-arm the file event so processing resumes on the next dispatcher tick.
The eventfd is non-blocking and is fully drained at the top of the method,
so the self-poke is safe.

Add a unit test that injects more completions than the cap, asserts the
drain is split across multiple ticks, and verifies all completions still
fire (proving the eventfd re-arm path works without an explicit activate()).

Signed-off-by: Adam Buran <aburan28@gmail.com>
@repokitteh-read-only
Copy link
Copy Markdown

Hi @aburan28, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #44665 was opened by aburan28.

see: more, trace.

@aburan28 aburan28 temporarily deployed to external-contributors April 27, 2026 00:21 — with GitHub Actions Inactive
@aburan28 aburan28 marked this pull request as draft April 27, 2026 00:23
@aburan28 aburan28 marked this pull request as ready for review April 27, 2026 01:36
ravenblackx
ravenblackx previously approved these changes Apr 27, 2026
Copy link
Copy Markdown
Contributor

@ravenblackx ravenblackx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I like this improvement. One nitty request about the tests, but I'm marking it approved by me early since it's going to need a review from a senior maintainer anyway.

Comment thread test/common/io/io_uring_impl_test.cc Outdated
// Each tick after the first must have been driven by the eventfd self-poke since we never
// called activate() again — proving the re-arm path works.
EXPECT_EQ(7, total_completions);
EXPECT_GE(event_callback_invocations, 3u);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this EXPECT_EQ? I thought for a moment it might be because other unrelated events could increment it, but it's only incremented by this specific event, so it seems like it would always be exactly 3.

If there is a reason, please add a comment explaining it; if there is not, please make it EXPECT_EQ.

(Also for consistency either total_completions should be uint32_t or event_callback_invocations should be int32_t, they're both just counting up from zero so it's weird that they're different types.)

@ravenblackx
Copy link
Copy Markdown
Contributor

/assign-from @envoyproxy/senior-maintainers

@repokitteh-read-only
Copy link
Copy Markdown

@envoyproxy/senior-maintainers assignee is @zuercher

🐱

Caused by: a #44665 (comment) was created by @ravenblackx.

see: more, trace.

zuercher
zuercher previously approved these changes Apr 30, 2026
@phlax
Copy link
Copy Markdown
Member

phlax commented Apr 30, 2026

could you merge main please @aburan28

…ound-injected-completions

# Conflicts:
#	changelogs/current.yaml
@aburan28 aburan28 dismissed stale reviews from zuercher and ravenblackx via 448fb78 May 4, 2026 22:34
@aburan28 aburan28 had a problem deploying to external-contributors May 4, 2026 22:34 — with GitHub Actions Error
@aburan28 aburan28 had a problem deploying to external-contributors May 4, 2026 22:48 — with GitHub Actions Error
@zuercher
Copy link
Copy Markdown
Member

zuercher commented May 5, 2026

DCO needs to be fixed now. Also can you address the comments in ravenblackx's review?

@zuercher
Copy link
Copy Markdown
Member

zuercher commented May 5, 2026

/wait

aburan28 added 2 commits May 14, 2026 22:08
Per ravenblackx review: tighten EXPECT_GE to EXPECT_EQ since the
eventfd is only written by IoUring's self-poke, so the callback fires
exactly 3 times for 3+3+1. Also unify event_callback_invocations to
int32_t for consistency with the other counters.

Signed-off-by: Adam Buran <aburan28@gmail.com>
Signed-off-by: Adam Buran <aburan28@gmail.com>

# Conflicts:
#	source/common/io/io_uring_impl.cc
@aburan28 aburan28 force-pushed the aburan28/io-uring-bound-injected-completions branch from 548de3d to 39c3a9a Compare May 15, 2026 05:28
@aburan28 aburan28 requested a deployment to external-contributors May 15, 2026 05:28 — with GitHub Actions Waiting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants